ABSTRACT

A data file having a plurality of data blocks is divided into one or more transfer units of data blocks. Before data storage, each transfer unit of data blocks is subjected to its own data compression cycle to create a group of compressed data blocks. The size of the data transfer unit, in bytes, is selected to facilitate addressing and retrieving individual recorded groups of compressed data blocks while providing good channel utilization and compression efficiency. Also the data transfer unit size is selected in part based upon data storage efficiency, i.e. the storage of the compressed data should fill as many addressable data storage areas as possible. Upon recording each group of compressed data bytes, an entry is made into a file directory for enabling addressing the recorded compressed data blocks.
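
The grouping described in the abstract can be pictured with a minimal sketch. It is not the patented implementation: the sector size, the number of blocks per transfer unit, the use of zlib as the compressor, and the directory field names are all assumptions chosen for illustration.

```python
import zlib
from math import ceil

SECTOR_SIZE = 1024   # assumed sector capacity in bytes (hypothetical value)
BLOCKS_PER_DTU = 8   # assumed number of data blocks per transfer unit


def record_file(blocks, sector_size=SECTOR_SIZE, blocks_per_dtu=BLOCKS_PER_DTU):
    """Group blocks into transfer units, compress each unit on its own,
    and return one directory entry per recorded group."""
    directory = []
    next_sector = 0
    for first in range(0, len(blocks), blocks_per_dtu):
        members = blocks[first:first + blocks_per_dtu]
        dtu = b"".join(members)
        packed = zlib.compress(dtu)
        # Store the group uncompressed when compression does not save bytes.
        compressed = len(packed) < len(dtu)
        payload = packed if compressed else dtu
        sectors = ceil(len(payload) / sector_size)
        directory.append({
            "first_block": first,
            "block_count": len(members),
            "start_sector": next_sector,
            "sector_count": sectors,
            "compressed": compressed,
        })
        next_sector += sectors
    return directory
```

Each directory entry records which data blocks a group holds and which sectors it occupies, which is the information needed to address a recorded group later. Only the last sector of a group can be partially filled in this sketch.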
DATA COMPRESSION/DECOMPRESSION AND STORAGE OF COMPRESSED AND UNCOMPRESSED DATA ON A SAME REMOVABLE DATA STORAGE MEDIUM

DOCUMENTS INCORPORATED BY REFERENCE

MacLean et al U.S. Pat. No. 5,109,226 is incorporated by reference for its showing of an in line data compaction and decompaction apparatus associated with a peripheral data storage device.

Gelb et al U.S. Pat. No. 5,018,060 titled "ALLOCATING DATA STORAGE SPACE OF PERIPHERAL DATA STORAGE DEVICES USING IMPLIED ALLOCATION BASED ON USER PARAMETERS".

Belser et al U.S. Pat. No. 4,914,725 titled "TRANSDUCER POSITIONING SERVO MECHANISMS EMPLOYING DIGITAL AND ANALOG CIRCUITS".

FIELD OF THE INVENTION

This invention relates to data storage systems that are capable of storing both compressed and uncompressed data on one removable data storage volume and to data processing systems utilizing such data storage systems. This invention also relates to data storage systems that minimize wasted data storage space on a data storage volume while storing compressed data.

BACKGROUND OF THE INVENTION

Many data storage media, such as data storage optical disks, have a so-called fixed block architecture (FBA) format. Such format is characterized in an optical disk by so-called hard sectoring the disk's single spiral track into a plurality of sectors. Every one of the sectors has identical data storage capacity, e.g. 512 bytes, 1024 bytes, 4096 bytes, etc. Because of the FBA disks and the variability of data lengths of compressed data with respect to the source uncompressed data, in-line data compression has not been employed with FBA formatted disks. It is desired to efficiently store and enable simple random address accessing a variable amount of compressed data resulting from compressing data formatted into addressable blocks. Such compressed data are then recorded on an FBA formatted disk. If the sector data do not compress to fewer bytes, then the data are stored without data compression on the data storage disk.

It is also desired to maintain host processor addressability of the compressed data blocks within each compressed group of data blocks. It is also desired when compressing data for storage on an FBA storage medium to maintain a maximal addressability of all unused data storing sectors even though the number of sectors required to store the compressed data blocks is unknown. A further desire is to provide for random addressing of the compressed data blocks recorded in an FBA formatted storage medium.

The data pattern randomness of most input data streams and the variability in the resulting length of the compressed data output after the application of the various compression algorithms do not allow for the prediction of the amount of storage space required to contain the compressed data. This situation requires a link between the transmission of the data stream to be compressed and recorded and the results of the compression process to assist the host processor in its storage management process.

The function of updating a data file in this environment cannot use any usual data updating process (read, update, write back) because the data pattern as a result of the update may not compress to the same degree as the original data block and therefore updated compressed data most probably will not fit in the original storage space required to store the original data.

In a fixed block architecture (FBA) environment, data are recorded on a data storage medium in fixed sized units of storage called sectors where each recording track on the medium contains a fixed number of such sectors. The addressing convention for optical disk devices consists of a track address on the medium and a sector number of the particular track. On optical media storage devices, each of the sectors consists of two major parts: an Identification field (ID) used by the device controller to locate a particular sector by a physical address and a data field for storing data. The informational content of the IDs on hard sectored optical disks is indelibly recorded, as by a stamping/molding process, on the medium at the time of manufacture. Other data storage formats also are usable to practice this invention, such as the known count-key-data (CKD) and extended count-key-data (ECKD) [illegible] magnetic disk media.

[illegible] Computer Standard Interface (SCSI) must provide the capability to resolve a Logical Block Address (LBA) used by SCSI architected direct-access data storage devices to address fixed sized units of storage to [illegible]. The SCSI attached FBA device provides to the host a contiguous address space of N (N is a positive integer) [illegible] writing in any sequence. Each LBA directory structure (addresses ranging from 0 to N) is the [illegible] retrieve data blocks in the SCSI-FBA environment (some FBA devices also provide the capability to address the storage space using the physical address).

As can be seen from the preceding paragraphs, the principal problem facing a designer of a storage system using data compression techniques in the SCSI-FBA environment is to provide a mechanism by which fixed size units of data, herein termed data blocks, in an input data stream can be recorded in a variable amount of medium storage space and still maintain addressability to the unoccupied storage space and provide for addressability to the recorded data blocks.

Since many optical disks today are of the removable type, it is further desired to enable the removable data storage medium to be self-describing as to compressed and uncompressed data.

DISCUSSION OF THE PRIOR ART

The Vosacek U.S. Pat. No. 4,499,539 shows first allocating a number of data storage segments of a cache or buffer for storing a maximum number of data bytes that are storable in an addressable track of a direct access storage device (DASD) connected to the cache or buffer. The DASD is a magnetic disk storage device. The protocol is to stage or transfer one track of DASD data to the cache or buffer in one input-output operation (one access to the DASD). Upon completion of the actual data transfer, the cache or buffer is examined. If
less than all of the first allocated segments contain data, then the empty allocated segments are deallocated. Pointers are recorded in a first one of the allocated segments for pointing to additional allocated segments that store data from the same DASD track. In this manner the DASD track is emulated in the cache or buffer.

Co-pending commonly-assigned application for patent Ser. No. 07/441,126, now U.S. Pat. No. 5,097,261 shows a data compaction system for a magnetic tape peripheral data storage system. Tapes do not have any addressable data storage areas. The entire tape is formatted each time it is recorded. This formatting feature in magnetic tapes enables storing variably sized records as variably sized blocks of data. The storage of uncompressed and compressed data is by addressable blocks of such data. The application does show including a plurality of records in one block of data recorded on the tape. Another co-pending commonly-assigned application for patent Ser. No. [illegible] filed Jun. 28, 1989, now U.S. Pat. No. 5,200,864 shows a magnetic tape storage system that automatically stores a plurality of small records in each block of recorded data. Each of the records remain individually addressable. A purpose of combining a plurality of records in one block is to reduce the number of inter-block gaps for increasing the storage capacity of the magnetic tape.

Data compression and decompression algorithms and systems are well known. The MacLean patent, supra, shows an in line (real time) data compression/decompression system for use in high speed data channels. This system uses an algorithm shown in the Langdon, Jr. et al U.S. Pat. No. 4,467,317. Batch processed (software) data compression and decompression is also well known. PKWARE, Inc., 7032 Ardara Avenue, Glendale, Wis. 53209 USA provides the software programs PKZIP for batch compression, PKUNZIP for batch decompression among other compression-decompression software. Another data compression-decompression algorithm has been used for both batch (software processing) and in-line (hardware-integrated semiconductor chips) processing. The known Lempel Ziv-1 data compression/decompression algorithm is used for both in-line (real time) and batch data compression and decompression. It is preferred to use the latter algorithm. Shah and Johnson in the article DATA COMPRESSOR DECOMPRESSOR IC in the "1990 IEEE International Symposium on Circuits and Systems", New Orleans, La. USA (pp 41-43) on May 1-3, 1990 describe an integrated circuit using the known Lempel-Ziv algorithm mentioned above. In practicing the present invention, it is preferred that a compression-decompression algorithm that facilitates both batch and in line operations be used. Of course, only batch or only in line data compression-decompression may be used to successfully practice the present invention.

Images or "non-coded" data have been compressed and decompressed for saving data storage space. Reitsma U.S. Pat. No. 4,622,585 shows one video compression scheme.

SUMMARY OF THE INVENTION

An object of this invention is to provide flexible data compression-decompression controls that enable randomly accessing compressed data through relatively simple accessing mechanisms.

In accordance with the present invention, a data file having a plurality of addressable data blocks is segmented into a plurality of groups of such data blocks. Each group of data blocks is separately compressed and decompressed as one unit of data. Each such group is separately transmitted between a host processor and a data storage unit, communications link, etc. as one data transfer unit (DTU). The size of the DTU, in terms of the number of data blocks to be included, is determined empirically based upon the data storage capacity (number of data bytes) storable in sectors of a data storage unit, the number of bytes in each of the data blocks of the data file and other system parameters. The data storage of each group in compressed form in a data storage device is described by the data storage system to the host processor, preferably by a command linked to the host processor command effecting the data storage in compressed form. The host processor establishes a directory describing the storage of each and every [illegible] is transferred to [illegible] processor in the compressed [illegible] the compressed data file directory accompanies [illegible] compressed groups. Retrieving compressed data [illegible] retrieving the group of [illegible] compressed group of data blocks is transferrable [illegible] processors and data storage units without decompression. The DTU or group receiving data storage [illegible] count-key-data [illegible].

The foregoing and other objects, features, and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings.

DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flow chart illustrating data storing operations of the present invention.

FIG. 2 is a simplified block diagram of a data processing system in which the FIG. 1 illustrated data storing operations may be advantageously employed.

FIG. 3 is a diagrammatic showing of a Logical Block Address (LBA) directory for illustrating identifying recorded compressed groups of data blocks of a data file.

FIG. 4 is a flow chart showing machine operations that update a compressed data file.

FIG. 5 is a block diagram of a peripheral controller usable in the FIGS. 2 and 4 illustrated data processing systems.

FIG. 6 diagrammatically illustrates storing a compressed group of data blocks as shown in FIG. 1.

FIG. 7 diagrammatically illustrates host processor commands using a SCSI connection to a data storage system as shown in FIGS. 2 and 4.

FIG. 8A diagrammatically illustrates a file directory of a plurality of compressed groups of data blocks of a file.

FIG. 8B diagrammatically illustrates format of a disk sector.

FIGS. 9-13 are flow charts showing details of the operation shown in FIG. 1.

FIG. 14 is a logic diagram illustrating applying the present invention to a multi-unit data processing system that has a plurality of data storage devices and host processor interconnected as by a data link or local area network.
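
As a rough illustration of the random access that the summary aims for, the sketch below retrieves one data block from a recorded group. Because the group is a single compression unit, decompression starts at the first compressed byte and runs only as far as the end of the wanted block. The function name, its parameters, and the choice of zlib are assumptions for illustration and are not taken from the specification.

```python
import zlib


def read_block(group_payload, compressed, block_size, index_in_group):
    """Return one fixed-size data block from a recorded group of blocks."""
    start = index_in_group * block_size
    end = start + block_size
    if not compressed:
        # Groups that did not shrink are stored as-is and can be sliced directly.
        return group_payload[start:end]
    inflater = zlib.decompressobj()
    # Decompress from the beginning of the group, but produce no more than
    # `end` bytes of output so later blocks are never expanded.
    out = inflater.decompress(group_payload, end)
    while len(out) < end and inflater.unconsumed_tail:
        out += inflater.decompress(inflater.unconsumed_tail, end - len(out))
    return out[start:end]
```

The cost of reaching a block grows with its position inside the group, which is one reason a smaller group size gives quicker access at the expense of storage efficiency.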
DETAILED DESCRIPTION

Referring now more particularly to the appended drawing, like numerals indicate like parts and structural features in the various figures. A data file having a plurality of data blocks is divided into one or more transfer units of data blocks. Before data storage, each transfer unit of data blocks is subjected to its own data compression cycle to create a group of compressed data blocks. The size of the data transfer unit, in bytes, is selected to facilitate addressing and retrieving individual recorded groups of compressed data blocks while providing good channel utilization and compression efficiency. Also the data transfer unit size is selected in part based upon data storage efficiency, i.e. the storage of the data, after compression, should fill several allocated addressable data storage areas. Each of the allocated sectors in each group is filled to capacity except the last sector of a group that may be partially filled. It is desired to reduce the number of partially filled data storage sectors for more efficiently filling the FBA data storage disk with data. This desire is balanced with enabling efficient random access to the compressed data blocks stored on the FBA data storage disk.

Each stored or recorded group of compressed data blocks is accessed from disk 30 as a single data unit irrespective of the number of disk 30 sectors in which the group is recorded. Since each group of compressed data blocks is compressed in a separate data compression operation, all of the data in each such group must be decompressed starting with the beginning, i.e. first compressed bytes, in each group. Therefore, in randomly accessing a compressed desired data block in a given group, all of the compressed data blocks of each stored group are read from disk 30 as a single disk record. The single disk record is decompressed up to the desired or addressed compressed data block. The desired compressed data block is then decompressed for processing. Limiting the size of the groups of compressed data blocks provides for quicker access to any desired compressed data block. This desire is balanced with a desire to maximize utilization of the disk 30 data storage space. An example of managing these two parameters for creating a desired size of compressed data blocks (that varies with each application) is described later.

In an alternate arrangement, each data block is separately compressed. A plurality of such separately compressed data blocks are combined into a single disk record. The byte position within the single disk record for each of the separately compressed data blocks is recorded in the single disk record. Such byte position or offset enables addressing each of the compressed data blocks within a group.

To facilitate access to the groups of compressed data blocks, the host processor program maintains a directory that identifies the addressable data storage areas containing the group as well as the data blocks in the respective groups. This identification preferably takes the form of a file directory that is maintained in host processor 11. Such directory is also stored on the volume or data storage disk containing the group(s) of compressed data blocks. Preferably, the directory is transmitted to the disk device as a part of each transfer of a compressed file having plural groups of compressed data blocks. This arrangement establishes on the FBA disk a directory that effects addressability of the compressed data blocks within the respective groups.

FIG. 1 illustrates recording a data file by grouping a plurality of data blocks of the file into a smaller number of groups of compressed data blocks. Step 10 is executed in a host processor 11 (FIG. 2). A data file, or part of a data file, is identified for compressed data storage. The data file consists of a plurality of data blocks. The term data block includes data records (coded data), sub-file structures, individual images, graphs and the like, drawings and other forms of graphics, combined graphics (non-coded data) and text (coded data), and the like. As later detailed, the data file is divided into desired sizes of compressed data blocks for transfer as a data transfer unit DTU to a storage unit or over a communication link and for maintaining a random access capability to the recorded groups of compressed data blocks. The size of each DTU and resultant recorded group is dependent on diverse variables, as will become apparent. Completion of one execution of step 10 results in one such group of data blocks being selected for compression and storage.

Step 13 is executed by host processor 11 (FIG. 2). The number of uncompressed data bytes in the DTU of data blocks (the product of the number of data blocks times the number of bytes in each data block) is divided by the data storage capacity of one addressable data storage area (sector of an FBA formatted disk) and rounded to a next higher integer if the quotient includes a fraction. This number represents a maximum number of addressable data storage areas required to store the data, either uncompressed or if a compression does not compress the data into fewer bytes for storage. At this juncture, it is not known how many addressable data storage areas are required to store the group of data blocks after compression. To ensure that the group of data blocks is storable on the data storage medium (optical disk 30 is used in the illustrative embodiment), a number of the addressable data storage areas sufficient to store the entire group of compressed data blocks is initially determined for storing the group of data blocks in an uncompressed form.

Step 15 is executed by both the host processor 11 and data storage system 12. The selected DTU of data blocks is transmitted by the host processor to the data storage system. The selected DTU of data blocks is compressed before storage on the data storage medium (not shown) of data storage device 21. There are several methodologies that may be employed herein. The FIG. 1 indicated methodology requires the data storage system to allocate the maximum number of addressable data storage areas. Then the data transfer occurs, requiring the data storage system to compress the selected DTU of data blocks just before the data are recorded on the data storage medium (not shown) of data storage device 21. Upon completion of the compression and data storage or recording as one continuum of data, data storage system 12 determines the number of addressable data storage areas actually used to store the compressed group of data blocks. The unused but allocated addressable data storage areas are then deallocated. In the event that certain data blocks compress to a greater number of bytes than the original or uncompressed data, then, as will become apparent, the data compression step is not used. Control data are recorded on the FBA disk that indicate which data are compressed and which data are not compressed. Such control data are used in retrieving data from the data storage (FBA) disk, as will become apparent. As later detailed in this specification, in step 16 data storage system
12 sends the storage locations of the just-recorded group of compressed data blocks to the host processor 11 for inclusion in a directory of the data file to which the recorded group of data blocks is a member.

A second methodology has the data compression-decompression performed in host processor 11. As such, host processor 11 includes the data compression mechanism, either software or hardware, and sends the compressed selected group of data blocks to data storage system 12 for storage. In this instance, if batch compression is used, host processor 11 determines the number of addressable data storage areas required for storing the compressed group of data blocks. Host processor 11 then sends the required number of addressable data storage areas to data storage system 12 for allocation just before the compressed data are transmitted to the data storage system.

In a third methodology, the uncompressed group of data blocks is transmitted twice by host processor 11 to data storage system 12. A first transmission enables data storage system 12 to accurately measure the number of addressable data storage areas that will be required to store the compressed data. In the first transmission the data are compressed but not recorded. The number of compressed data bytes is counted to determine the data storage extent (number of sectors or addressable areas) for the compressed data. The data storage system 12 then allocates the indicated number of contiguous sectors for receiving and storing the compressed data. A second transmission of the same data to the data storage system 12 results in the compression and storage of the compressed data in a data storage medium.

In each of the above described methodologies, if the number of bytes in the compressed file is greater than the number of uncompressed data bytes, then the data are recorded in the uncompressed form. Further, when updating a group of compressed data blocks, the number of compressed data bytes may exceed the capacity of the currently allocated sectors. As described later with respect to FIG. 4, a change in allocation of sectors for storing the updating DTU may be required.

Also, in each of the above described methodologies, the data blocks to be compressed and stored from each DTU are preferably compressed and stored as one group. That is, all data blocks in each DTU are compressed during one data compression cycle to produce one group of compressed data blocks. An alternate data compression approach is to individually compress each of the data blocks in each DTU. Then the group of compressed data blocks consists of a plurality of individually compressed data blocks. In the alternative data compression, a header in each group can identify the byte offset within each group of the individually compressed data blocks. Such individually compressed data blocks may also be identified on the data recording disk by illegal recording code characters; such characters are well known for diverse data recording codes.

Host processor 11 in step 19 logically associates all recorded groups of compressed data blocks via a later described file directory. When employing the above described first methodology, upon storing the compressed data, data storage system 12 reports to host processor 11 the actual number of sectors used to store the compressed data and further address and identifying data therefor, as will be described.

At step 20, host processor 11 determines whether all of the data to be compressed and recorded have been recorded. The details regarding the recorded group of compressed data blocks (see step 19) have been entered into the later described file directory (FIG. 8A). If all of the above described machine operations have been completed, then the operation is "done", enabling exiting to other machine operations beyond the present description. Otherwise, steps 10-19 are repeated as above described until all of the data have been compressed and recorded. It is to be noted that other machine operations may be performed by host processor 11 in a multi-tasking or interrupt driven data processing environment while steps 10-19 are in the process of execution, as is known in the data processing art.

FIG. 2 shows a data processing system in simplified form. Host processor 11 attaches a data storage system 12. Data storage system 12 includes a peripheral controller 20 that connects host processor 11 to data storage device 21. Device 21, in one embodiment of this invention, is a magneto-optical data storage device that operates with removable magneto-optical data storage media or a single medium (disk). As later used in this specification, the term programmed machine includes host processor 11, peripheral controller 20 and programmed portions of data storage device 21. The compression-decompression mechanisms are preferably in the programmed machine. For in-line compression-decompression, it is preferred that the compression-decompression occur in peripheral controller 20. As later described with respect to FIG. 14, the location of the compression-decompression mechanism can be anywhere in the programmed machine. For batch compression-decompression it is preferred to place the compression-decompression in host processor 11.

FIG. 3 illustrates a logical block address (LBA) structure 23 used in magneto-optical disk data storage systems for addressing sectors of an optical disk. LBA 23 is a logical to real address translation mechanism that enables full advantage of practicing the present invention. This sector addressing is based upon the logical addressing found in many present day optical disk data storage devices. The attaching host processor 11 addresses data on the data storage medium (not shown) of data storage device 21 using a logical block address included in LBA 23. LBA 23 determines which of the physical addressable data storage areas, such as sectors, are addressed by the respective LBA address. In an alternate addressing arrangement, host processor 11 requests access to a named file. This alternate addressing arrangement includes host processor 11 identifying a byte location within the file to begin a data operation and a number of bytes (byte length) to be subjected to the data operation, e.g. read from the disk.

LBA 23 is managed by either one of two algorithms. A first one has been used for optical disks. In this algorithm, the number of entries in LBA 23 is constant for each disk and is based upon the number of addressable entities in the disk designated for storing data. Spare addressable data storage areas or sectors are not included in the LBA 23 logical address sequence, as is known. Known secondary pointers enable addressing spare sectors via LBA 23.

A second algorithm for addressing using LBA 23 is used in magnetic flexible diskettes. In this second algorithm, the address range of LBA 23 varies with the number of demarked or unusable sectors. LBA 23 identifies for addressing only the tracks and sectors that are designated for storing data. In the event one of the
sectors identifiable by the illustrated address translation becomes unusable, then the unusable or defective sector is skipped and replaced by another sector. Such substitution is well known.

All of the addressable tracks and sectors on disk 30 are addressed via LBA 23. Such addressing is a table look up matching the host processor 11 supplied logical address to a physical disk track and sector storing the data identified by the supplied logical address. Each LBA logical address has one entry 14 in LBA 23.

Numerals 17 and 18 indicate groups of compressed data blocks recorded on disk 30 using the present invention. Numeral 17 indicates the first group of compressed data blocks of one file. Numeral 18 indicates subsequently recorded groups of compressed data blocks from the same file. The enumeration of the data blocks in the recorded groups 17-18 is maintained in its original sequence as generated by host processor 11. As will become apparent, the compressed data blocks in the respective groups are identified in a file directory shown in FIG. 8A.

FIG. 5 illustrates a peripheral controller 20 used in an early embodiment of this invention. Such peripheral controller 20 is interposed between host processor 11 (FIG. 2) and data storage device 21. Data storage device 21 may be the optical disk device shown in the Belser et al US patent, supra. Controller 20 includes the compression-decompression mechanism for in-line or real time data compression-decompression. A connection between host processor 11 and peripheral controller 20 is effected by a SCSI module 100 that implements the known small computer system interface. An I/O data buffer 103 (dynamically allocated into input data buffers and output data buffers using known techniques) temporarily stores data received from or to be transmitted to the host processor 11. An Optical Disk Controller (ODC) 104 manages the reading and writing of the data to a suitable optical recording disk (not shown) in data storage device 21. Error Correction Control (ECC) module 106 detects and corrects errors in data being read and generates ECC error detection and correction redundancy characters to be written to the medium with the data. Run Length Limited (RLL) (mod-demod) encoding and decoding is performed in data storage device 21 in a usual manner. Such mod-demod encodes and decodes recorded data patterns, such as used in the known 1-7 d-k code. Microprocessor 107 (plus control store 108 and dynamic store 109) controls the various elements of the controller 20. A Compression/Decompression (CD) module 101, such as an integrated circuit referred to by Shah et al, supra, implements the compression algorithms. CD module 101

118 plus about one-half of sector 119. The remaining half of last sector 119 is filled with padding bytes, as is known. Numeral 122 indicates a sector that was allocated previously. Numeral 123 indicates a next sector(s) that were initially allocated according to the above-described first methodology. The linked response of controller 20 to the write-compress command indicates to host processor 11 that sector(s) 123 are to be deallocated as such sectors did not receive any of the data from group 115. Host processor 11 responds to controller 20 to deallocate sectors 123.

The above description assumes that host processor 11 is performing data space management. This arrangement is usual. It is to be pointed out that in a multi-host arrangement of sharing device 21, one of the hosts may be designated to perform space management. Also, in some systems the peripheral controller performs data storage space management.

FIG. 7 illustrates in abbreviated form three commands for use in a known SCSI interface. WRITE command 130 includes the operation code field 131 that indicates the command is a WRITE command. LBA address field 132 indicates the first LBA address at which storage of the data being transmitted in accordance with the instant WRITE command is to begin (the lowest LBA address of possibly several LBA addresses required to be used in storing data into a plurality of disk 30 sectors). Field 133 indicates the number of units of data that are to be transferred from host processor 11 to device 21 for storage on disk 30. One unit is that data storable in one sector of the disk 30. FBA disks may have different data storing capacity sectors, such as 512, 1024 (1 kb), 2048, or 4096 bytes of data. Field 134 indicates whether or not the data to be transmitted is to be compressed. Field 135 indicates that this WRITE command is linked to read buffer command 140. This command linkage requires peripheral controller 20 to report to host processor 11 the details of the data storage, i.e. number of sectors actually used, the data that enables host processor 11 to build an entry for the later described FIG. 8A illustrated file directory, and identifies the sectors to be deallocated. It is noted that LBA 23 is updated in host processor 11 with a copy thereof recorded in a sector of disk 30. Also, a copy of the FIG. 8A illustrated file directory is recorded on disk 30, preferably in an uncompressed form at a first LBA 23 logical address that immediately precedes the first LBA address for storing compressed data.

Read buffer SCSI command 140 includes operation code field 141 that indicates the command is a READ BUFFER command. Controller 20 responds to receipt of a READ BUFFER command to transfer data from

includes automatic circuit timing and control, as is an output register(s) of lO buffer 103. Controller 20 

known, to control data flow through peripheral con- stores the information relating to storing a group of 
troUer 20 under supervision of microprocessor 107. This 55 compressed data blocks in such output buffer 103 regis- 

compression-decompression is in real time ^-line) with ter(s) in preparation to respond to the READ BUFFER 

the data transfer. Busses 102, 110 and 111 intercoimect command linked to the WRITE command 130. Field 

the modules, as shown. Controller 20 is preferably pack- 142 indicates to controller 20 the number of sectors used 

aged with data storage device 21 on a common frame. to store the compressed data blocks. That is, host pro- 
FTG. 6 illustrates compression of several data blocks 60 cessor 11 knows the number of disk sectors required for 

into one group of compressed data blocks recorded in a storing the compressed data blocks, hence the new 

number of data storing sectors 118 of track 117 of disk entry for the FIG. 8A illustrated file directory. 

30. A group 115 of a plurality of data blocks 116 is READ DATA command 145 has operation code 

selected for recording as described with respect to FIG. field 146 having an indication that the command is a 
1. Group 115 of compressed data blocks is transmitted 65 READ DATA command. The first LBA address to be 

to controller 20 by host processor 11. CD 101 in con- used for transferring data from disk 30 to host processor 

troUer 20 compresses group 115 sufficientiy to be re- 11 is mdicated in field 147. Field 148 indicates the num- 

corded as a group of compressed data blocks in sectors ber (n) of data blocks requested or commanded to be 
transferred from the FBA disk to the host processor 11. Field 149 indicates to controller 20 the number (N) of disk sectors that are to be read. Field 150 indicates that decompress is either on or off. Link on bit 151 is usually reset to be inactive. For reading one group of compressed data blocks, controller 20 reads the indicated number (N) of sectors, decompresses the data blocks, then transfers the decompressed data blocks to host processor 11. Controller 20 counts the number of data blocks transferred such that when the indicated number n of field 148 is reached, the data transfer is terminated. The data block counting is also used as an integrity check.

The FIG. 8A illustrated file directory can indicate different levels of detail; the selected level is application dependent. Every file that has data blocks recorded in groups of compressed data blocks has a separate portion of the directory, respectively indicated by numerals 161, 162 and 163 for three different data files. Each row 160 of each directory represents one entry. A first entry in each directory includes in column 164 the filename of the file and the LBA address at which the directory is recorded on disk 30. Column 165 in the first or topmost entry indicates the number of data blocks in each data transfer unit. The term data transfer unit (DTU) indicates that a given number of data blocks are to be transferred between disk 30 and host processor 11 during each data transfer. The remaining entries 160 are respectively for the transmitted and recorded groups of compressed data blocks. Again, column 164 in the respective entries indicates the first LBA address used to store the group. Column 165 indicates the number of data blocks recorded and the number of sectors used to store the respective groups of compressed data blocks on disk 30. Once all of the data blocks are compressed in a single data compress operation, the group of compressed data blocks are a continuum of data with no external indication of the data block boundaries. The decompression mechanism and associated controls identify the data block boundaries after decompression, as is known.

In addition to the information contained in the FIG. 8A illustrated file directory, additional details of each group may be provided. In such an alternate implementation of the file directory, controller 20 returns, in addition, for each group of compressed data blocks (i.e. for each respective entry of the FIG. 8A illustrated file directory) a map of the relation of data blocks and data storing sectors (uses the LBA logical address, not the actual physical location on disk 30) for each of the groups. This additional information is used by the host to manage the recorded data and unused disk 30 sectors indicated in LBA 23.

All entries contain the above indicated mapping of data blocks to LBA addresses for each and every group (Gp.) of compressed data blocks in the current file. That is, each data block is indicated as being recorded in one or more sectors, depending on the compression and size of the data blocks. Several compressed data blocks may be recorded in one sector. In this instance, the LBA addresses are the same for starting and ending, i.e. LBA10 to LBA10 for example could occur for several data blocks.

A format of the FIG. 8A illustrated directory using the additional addressing information is set forth below.

First entry: Number of data blocks in data transfer unit
Second entry: Gp. 1 LBA; Number of data blocks and sectors in this group
    data block n: LBA N at byte B
    data block n+1: [illegible in source]
    data block n+2: LBA N2 at byte B2
    (Map of all data blocks in group (Gp.) 1 continues; the term "byte" indicates byte displacement of the respective compressed data block as recorded in a sector.)
[entry illegible in source] (Map of data blocks to LBA addresses as set forth above)

The superscripts merely indicate 1st (no superscript) [illegible] positions of respective blocks n, n+1, n+2 etc.

The above described directory structures enable the contents of a single group of compressed data blocks to be updated without the necessity of reading then rewriting the entire file. An update of a group of compressed data blocks only requires the reading of one group of compressed data blocks. The update group of compressed data blocks may require more sectors for [illegible] generation [illegible] compressed data blocks. That is, additional [illegible] allocated. Since it is desired that each [illegible] compressed data blocks are recorded in contiguous [illegible] unaddressable intervening defective sectors), a new allocation may be required. All of this activity is explained later with respect to FIG. 4. Host processor 11 uses this information to determine the next available set of contiguous LBA addresses that have sufficient number of addresses (sectors) for storing the updated group of compressed data blocks.

For WORM (write once, read many) optical disks, the host processor may issue a MEDIUM SCAN command to locate the next available LBA addressed sector for storing the updated group of compressed data blocks. Host processor 11 saves this information in an expanded directory entry for use when the data are to be retrieved or read.

As later described with respect to FIG. 10, another control parameter is a minimum or maximum number of sectors to be used in the CKD and ECKD examples for practicing the present invention. The number N of sectors required to store the uncompressed data is compared with a MIN (minimum value) and a MAX (maximum value). If the number of required sectors is between the MIN and MAX values, then a DTU is made using the number N. MIN ensures a reasonable usage of disk storage space while MAX ensures a reasonable access to compressed data blocks. If N is greater than MAX, then N is made equal to MAX. If N is less than MIN, then N is made equal to MIN. The number of data bytes in a DTU is N*SB (SB is number of bytes storable in one sector) for FBA devices and N*DB (DB is number of data bytes desired for storing one data block) for CKD and ECKD devices. The number of bytes in a DTU is stored in the first or top entry 160 (FIG. 8A) of each file directory. As one variation, field 166 in each of the entries 160 contains a compress DTU indicating bit C. If C is unity, then the data represented by the respective entry 160 are recorded in a compressed form. If bit C is zero or nil, then the data are recorded on disk 30 without data compression. The compressed bit C may
also be recorded in each and every sector storing data in accordance with the present invention.

FIG. 8B diagrammatically illustrates format of a disk sector of an FBA disk. Sector 170 is in track 169 of disk 30. Intersector gap 171 separates sector 170 from an immediately preceding sector (not shown). Sector ID 172 is an embossed area that contains the track and sector address of sector 170. Intrasector gap 173 separates the hard sectored or embossed mark 172 from the magnetooptically recorded portion that constitutes the remainder of sector 170. Data synchronization signals DATA SYNC 174 are magnetooptically recorded with the data stored in portion 175 of sector 170. Control area 176 stores magnetooptically recorded control signals, as may be desired. A compress bit C 177 (considered a part of the control signals in area 176) if set to unity indicates that the data in portion 175 are compressed. If C 177 is set to zero or nil, then the data stored in portion 175 are not compressed. Sector 170 ends with the error detection and correction redundancy in ECC 178 portion. ECC 178 stored signals are generated and stored in a known manner that is not pertinent to an understanding of the present invention. Inter-sector gap 179 separates sector 170 from a next succeeding sector 180. It is preferred that compress bit 177 be used while practicing the present invention.

FIG. 9 is a flow chart showing a sequence of machine operations for storing a file in a plurality of groups of compressed data blocks wherein each group is separately transmitted from a host processor to a data storage system as a DTU having a number of uncompressed bytes as set forth above. At step 185 the data to be recorded is analyzed for determining the number of DTUs to be generated. The actual size in bytes/data blocks of a DTU may be different from file to file. In step 186, the DTU size is modified to accommodate the number of data blocks to be initially recorded for equalizing the sizes of a plurality of DTUs to be used. For example, if the number of data blocks to be compressed and recorded is less than two desired DTUs and one half of the number of data blocks results in a number of data bytes greater than MIN, then two DTUs each having one-half of the data blocks are created. This same principle is applied to transferring data blocks having any number of DTUs except for updating a recorded group of compressed data blocks, as will become apparent. If the DTU sizes cannot be equalized, then a last DTU may have a number of bytes less than the MIN (minimum) number of bytes. Upon updating the recorded group of compressed data blocks resulting from a small last DTU, a DTU is generated that adds a number of data blocks to make the size of the DTU, hence group of compressed data blocks, larger to meet the DTU size requirements set forth with respect to FIGS. 9 and 11. FIG. 4 relating to updating a recorded group of compressed data blocks illustrates machine steps for storing an updated DTU that is too large for the current allocated data storage space for a recorded group of compressed data blocks resulting from compressing and storing the updated DTU.

At "GET DTU" step 188 (FIG. 9), a DTU of data blocks is built for data transfer. Step 189 transfers the DTU to data storage system 12. Data storage system 12 compresses and stores the transferred DTU as described earlier. At step 190, host processor 11 ascertains whether another DTU is to be transferred. If not (DONE=1), then host processor 11 exits for performing other work not related to practicing the present invention. Otherwise, steps 188 and 189 are repeated until all DTUs have been transmitted to data storage unit 12.

FIG. 10 is a flow chart showing selecting a MIN and a MAX value respectively for image (non-coded or graphics) data and text (coded) data. The compressibility of data is a measure for selecting MIN and MAX. In this regard, each file of image or text data may compress substantially differently from data from other files, as well as changing from data block to data block in either type of data, image or text. Once a first group of data blocks have been compressed and recorded as a group of compressed data blocks, the compression ratio may be recorded in the FIG. 8A illustrated file directory as a reference for subsequent compression and storage of data blocks. The FIG. 10 illustration assumes that the image data has been compressed 75% (compressed image data blocks are 25% of original size) and text data blocks have been compressed about 50%. These measured values may be changed for calculation purposes for adding a margin of error accommodation into the calculations.

Step 195 determines whether the data in the file is text or image. If image, step 196 calculates the MIN value as 4*SB (bytes in a sector), i.e. at least four sectors are to be used for storing a group of compressed data blocks. The number four is selected in an arbitrary manner. Sector size affects the minimum number of sectors to be used. Step 197 calculates MAX as being 64*SB. In an FBA disk having 1024 byte sectors, the maximum DTU size is then 64 KB (Kilobytes). Again, system considerations may change these values. Such considerations are beyond the present description. From step 195, for text data (IMAGE DATA=NO), step 200 calculates MIN as 2*SB while step 201 calculates MAX as 32*SB. The number of uncompressed bytes for image data in MIN and MAX is equal to the number of uncompressed bytes for text data. The different compression ratios change MIN and MAX values inversely to the expected compression ratio. Upon completing either calculation, host processor 11 stores the MIN and MAX values in the first entry 160 (FIG. 8A) of the appropriate file directory and then exits the calculation.

The MIN and MAX values may also be predetermined and included as parameter data defining a class of data as set forth in Gelb et al U.S. Pat. No. 5,018,060 titled "ALLOCATING DATA STORAGE SPACE OF PERIPHERAL DATA STORAGE DEVICES USING IMPLIED ALLOCATION BASED ON USER PARAMETERS". Gelb et al teach that data set parameters implicitly control peripheral data storage operations. Such implicit control based on data base or file parameter data may be applied to practicing the present invention.

FIG. 11 shows execution of a WRITE command by data storage system 12 wherein the data blocks received in one DTU are compressed then recorded as a group of compressed data blocks. Step 210 receives a WRITE command 130. Step 211 sets the link commanded in field 135 for reporting the actual number of sectors used to store the resultant group of compressed data blocks and a compression ratio CR achieved. Step 212 sets a compress mode in data storage system 12 for activating CD 101 to compress the data blocks being received into one continuum of compressed data. Step 213 receives, compresses and stores the DTU data blocks. Step 216 compares the number of sectors actually used to store the compressed data with the number of sectors initially allocated. Step 217 compares the byte count of the
original data blocks in the received DTU with the byte count of the compressed data blocks. In most instances, the byte count of the compressed data blocks will be less than the byte count of the original DTU data blocks. In this instance, at step 218, data storage system 12 indicates to host processor 11 that the data storage operation has been completed. The identification of any unused sectors plus other information describing the just-completed data recording operation is to be transferred from data storage system 12 to host processor 11. This transfer is effected by host processor 11 responding to the indication of a completed recording operation by issuing a READ BUFFER command 140 to data storage system 12 to send the number of unused allocated sectors and all other compression information to host processor 11. Host processor 11 in step 219 responds to the indication of unused allocated sectors to deallocate such sectors for use in storing other data. Note that if the compress bit 134 is off, then no compression occurs.

If at step 217 it is determined that the data compression resulted in more data bytes in the compressed data blocks than were in the original data blocks, then the data blocks will be recorded without data compression. This growth in size of the compressed data blocks may occur when the original data blocks have certain data patterns. In any event, at step 220, data storage system 12 sends a channel command retry (CCR) or its equivalent to host processor 11. CCR indicates that the DTU has to be retransmitted by host processor 11 to data storage system 12. That is, the increase in size of the DTU after compression is considered an error condition. The CCR indicates that a recording error has occurred. Host processor 11 responds to the CCR at step 221 by resending the DTU to data storage system 12. At step 222, data storage system 12 stores the DTU without data compression. The above-described operations are exited from either step 219 or 222.

FIG. 12 is a flow chart showing system operations for reading data. Host processor 11 in step 225 prepares to read data, i.e. identifies the data blocks to be read. Host processor 11 then in step 226 searches for a file directory (FIG. 8A). Such file directory may be read from disk 30. If there is no file directory relating to compression, then the data are not compressed. Also, if the field 166 of the FIG. 8 illustrated directory for the identified group is zero, then that group is not compressed. Further, if data to be read are compressed and it is desired to decompress in a unit other than the storing data storage system 12, step 226 directs host processor operations to read all identified data without decompression via path 227. From path 227, a usual data recording operation not involving data compression is performed (not shown). Host processor 11 builds and issues one READ command 145 for each of the recorded groups of compressed data blocks to be read. Depending on the desired read operation, field 150 of the READ command will be set to indicate either decompress ON or decompress OFF. Host processor 11 before sending the READ command 145 to data storage system 12 examines field 150 at step 226. If host processor 11 at step 226 finds that the data to be read are compressed and decompression is desired, then step 230 sets field 150 to compress ON. All of the groups of compressed data blocks having data blocks to be read are identified in step 231 via examination of the appropriate file directory 161-163. Host processor 11 in step 232 then builds one or more READ commands 145 for reading the step 231 identified groups of compressed data blocks with decompression. The term build used above indicates that the appropriate control data are inserted into a READ command for commanding data storage system 12 to perform a desired read. Such command includes the number of LBA addressed sectors to be read as well as the logical address in LBA of a first one of the sectors. One READ command is sent by host processor 11 to data storage system 12 in step 232; there can be a number of READ commands sent for fetching a plurality of groups of record blocks. Data storage system 12 receives the READ command. At step 233, data storage system 12 checks the sector compress bit of the first sector storing the requested group to be read. If bit C 177 (FIG. 8B) is unity, then the data are compressed. Data storage system 12 then in step 234 reads the requested group including decompressing the data. It is to be noted that if the READ command field 150 indicates decompression is OFF, then no decompression occurs even if bit C 177 is set to unity. On the other hand, if bit C 177 equals zero (data in the sector are not compressed), then at step 235 data storage system 12 reads and sends the read data without decompression to host processor 11. The FIG. 2 illustrated system exits the read operation for one group from either step 234 or 235.

FIG. 13 illustrates operation of data storage system 12 responding to a READ command 145. Step 236 receives the READ command. Step 237 checks the compress field 150. If the compress field indicates that decompress is ON, then C bit 177 of the sector being accessed is checked to ensure that the data to be read is in fact recorded and stored in a compressed form. Step 238 executes the READ command by decompressing the data being read if field 150 indicates compression and C bit 177 is ON. If the field 150 indicates decompression is OFF, the data stored in the addressed sectors are transferred without decompression whether compressed or not. That is, in all cases, data storage system 12 transfers the data without decompression if field 150 indicates compress is OFF. This control enables transferring data in either compressed or decompressed form.

FIG. 14 illustrates one application of the invention in a system having linked host processors. Both batch and in-line data compression/decompression are employed. Compression-decompression software modules 251 and 273 provide batch data compression and decompression while integrated circuit chips (hardware compress-decompress) 253 and 272 provide in-line (real time) data compression-decompression. Two data processing systems 240 and 241 are linked by data link 263. Link 263 may be a local area network (LAN), a data communication circuit or transfer of a removable data cartridge manually or via a library, mail etc. between the two data processing systems. Host processor 250 in system 240 has a software compress-decompress facility 251, a transfer link facility 252 that involves no compression or decompression and an in-line hardware compress-decompress facility 253. Facilities 251-253 may be physically located in data processing system 240 in host processor 250 or as a part of a channel connection that includes logic switch 254 (programmed or hardware) connecting host processor 250 to facilities 251-253. Dashed line 255 indicates that switch 254 is programmingly controlled by host processor 250. A given data processing system may have only 1) batch compress facility 251 and link facility 252, 2) in-line facility 253 and link facility 252, 3) all facilities 251-253 or 4) either facility 251 or 253 may be located either in data storage system 262 or data link 263.
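
The read-path rule described for FIG. 13 (data are decompressed only when field 150 of the READ command requests decompression and compress bit C 177 of the accessed sector is set) can be sketched as follows. This is a minimal illustration in Python, not the controller's implementation; the function and parameter names are hypothetical.

```python
def should_decompress(decompress_requested: bool, sector_c_bit: int) -> bool:
    """Decide whether data read from a sector is decompressed.

    decompress_requested models field 150 of the READ command
    (hypothetical name). sector_c_bit models compress bit C 177 of
    the accessed sector: 1 means the sector holds compressed data.
    """
    return decompress_requested and sector_c_bit == 1


if __name__ == "__main__":
    # Print the full decision table for the two inputs.
    for requested in (True, False):
        for c_bit in (1, 0):
            print(requested, c_bit, should_decompress(requested, c_bit))
```

With decompression requested OFF the data are always transferred as recorded, which is what allows compressed data to be moved to another unit for later decompression.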

The input-output (IO) connections from facilities 251-253 are effected by logic switch 260 that is programmingly controlled by host processor 250 as indicated by dashed line 261. Switch 260 directs IO data flow between facilities 251-253 and a data storage system 262 or data link 263.

Data processing system 241 is shown as being identical to data processing system 240. Data processing system 241 includes host processor 270 that may have a different computational arrangement and capability from host processor 250, logic switch 271, facilities 272-274, data storage system 275 and switch 277 that selectively connects data processing system 241 via data link 263 to other systems and to data processing system 240.

FIG. 4 illustrates updating a recorded group of compressed data blocks. Host processor 11 in step 280 has updated data blocks and desires to update a file recorded in data storage system 12 as a plurality of groups of compressed data blocks. Step 281 compares the data length (number of uncompressed data bytes) of the updating DTU with the number of bytes in sectors currently recorded as one group to be updated. Host processor 11 also examines the number of padding bytes in a last sector storing compressed data for estimating whether or not the updated data blocks are storable in the currently allocated sectors for the group(s) to be updated.

At step 282 host processor 11 determines whether the updating DTU can be stored in currently allocated sectors or whether more or different sectors should be allocated. That is, if the updating DTU has more data bytes than the currently recorded group, then additional sectors are allocated at step 288 (host processor 11 does the allocation). Such new sectors are preferably contiguous sectors that may not include any sectors containing the recorded group of data blocks to be updated. Following allocation step 288, the updating DTU is recorded at step 289. Then, host processor 11 at step 290 deallocates the sectors containing the group of data blocks to be updated. The FIG. 2 illustrated system then exits the updating operation from step 290.

If, at step 282, the number of data bytes in the updating DTU is substantially equal to the number of bytes (uncompressed) of the recorded DTU, then the updating occurs at step 283 using the sectors currently storing the group to be updated. The FIG. 2 illustrated system then performs step 290 before exiting the updating operation. If the updating DTU has fewer bytes than the recorded group, then the updating DTU is recorded in sectors selected from the sectors containing the group to be updated. The sectors not used to record the updating DTU are deallocated at step 290.
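
The three outcomes of the step 282 comparison can be summarized in a short sketch. The Python below is illustrative only: the names `UpdatePlan` and `plan_update` are hypothetical, and the 5% `tolerance` is an assumed reading of "substantially equal", which the description does not quantify.

```python
from enum import Enum


class UpdatePlan(Enum):
    # Steps 288-290: allocate new contiguous sectors, record, free the old ones.
    ALLOCATE_NEW = "allocate new sectors"
    # Step 283: record in the sectors currently storing the group.
    REUSE_ALL = "reuse current sectors"
    # Record in some of the current sectors; free the rest at step 290.
    REUSE_SUBSET = "reuse a subset of current sectors"


def plan_update(updating_bytes: int, recorded_bytes: int,
                tolerance: float = 0.05) -> UpdatePlan:
    """Choose an update path from the uncompressed byte counts.

    updating_bytes: uncompressed size of the updating DTU.
    recorded_bytes: uncompressed size of the recorded DTU.
    tolerance: assumed band for "substantially equal" (hypothetical).
    """
    if updating_bytes > recorded_bytes * (1 + tolerance):
        return UpdatePlan.ALLOCATE_NEW
    if updating_bytes < recorded_bytes * (1 - tolerance):
        return UpdatePlan.REUSE_SUBSET
    return UpdatePlan.REUSE_ALL
```

For example, `plan_update(8192, 4096)` selects new allocation, while `plan_update(1024, 4096)` selects reuse of a subset of the current sectors.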

It may be decided, independently of any data growth patterns, to always store the updated data blocks in a newly allocated set of sectors and to deallocate or free the sectors storing the current group(s) of compressed data blocks to be updated. In this situation, steps 288-290 are performed. For example, if there is a desire to save the original group(s) of compressed data blocks, such original recording may be retained. Host processor 11 then updates the appropriate file directory 160-162 and exits the storage operation.

In the updating operation shown in FIG. 4, whenever the compressed data has more bytes than the original uncompressed data, the data are recorded in an uncompressed form. The steps shown in FIG. 11 are added to the FIG. 4 illustrated sequence.
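
The fallback rule above (record uncompressed whenever compression makes the data larger) can be illustrated with a small sketch. It assumes Python's standard `zlib` module as a stand-in for CD module 101, whose actual algorithm is not described here; `encode_dtu` and `sectors_needed` are hypothetical helper names, and the 1024-byte default sector size is one of the FBA sector sizes mentioned earlier.

```python
import zlib
from typing import Tuple


def encode_dtu(dtu: bytes) -> Tuple[bytes, int]:
    """Return (bytes to record, compress bit C).

    zlib stands in for the compression hardware (an assumption).
    If the compressed form is larger than the original, the original
    bytes are returned with C = 0, i.e. recorded uncompressed.
    """
    compressed = zlib.compress(dtu)
    if len(compressed) > len(dtu):
        return dtu, 0
    return compressed, 1


def sectors_needed(n_bytes: int, sector_bytes: int = 1024) -> int:
    """Number of fixed-size sectors needed to hold n_bytes (ceiling division)."""
    return -(-n_bytes // sector_bytes)


if __name__ == "__main__":
    data, c_bit = encode_dtu(b"a" * 4096)
    print(c_bit, len(data), sectors_needed(len(data)))
```

Highly repetitive input is recorded compressed with C set to 1, while a very short input such as `b"abc"` grows under zlib framing and is therefore recorded as-is with C set to 0.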

While the invention has been particularly shown and 
described with reference to preferred embodiments 
thereof, it will be understood by those skilled in the art 
that various changes in form and details may be made 
therein without departing from the spirit and scope of 
the invention. 

What is claimed is: 

1. In apparatus for storing data in compressed form in 
a data storage device having a multiplicity of address-
able data storage areas, each of the data storage areas
for recording a first predetermined number of data 
bytes, the data storage device being connected to a 
programmed machine, said programmed machine for 
receiving data to be recorded, said received data being 
arranged in a plurality of addressable data blocks, the 
improvement including, in combination: 
selection means in the programmed machine for se- 
lecting a plurality of data transfer units of said data 
blocks to be recorded, each said data transfer unit 
of data blocks having a given number of data bytes 
not less than said first predetermined number and 
includes one or more of said addressable data 
blocks; 

allocation means in the programmed machine con- 
nected to the selection means for responding to said 
given number to indicate that said data transfer 
units of data blocks each requires a first number of 
said addressable data storage areas for storage in 
the data storage unit and for indicating that all of 
said first number of said indicated addressable data 
storage areas are allocated for storing data from 
respective ones of said selected data transfer units; 

compression means in the programmed machine con- 
nected to the selection means for receiving and 
compressing said data transfer units of data blocks 
into respective compressed blocks to be respec- 
tively recorded in a second number of said first 
number of addressable data storage areas, said sec- 
ond number being equal to or less than said first 
number; 

data access means in said data storage device and 
being connected to said compression means for 
respectively receiving and then respectively re- 
cording said compressed blocks in said second 
predetermined ones of said first number of said 
addressable data storage areas and indicating that 
the respective compressed block is recorded in the 
respective ones of said second predetermined ones 
of said first number of said addressable data storage 
areas; and 

directory means in the programmed machine and 
connected to said data access means and to said 
allocation means for receiving said indications of 
said allocation and said indications of said second 
ones of said first number of said addressable data 
storage areas for indicating that said compressed 
blocks are recorded in said respective second pre- 
determined ones of said first number of said ad- 
dressable data storage areas and that said recorded 
compressed blocks contain respective ones of said 
selected data transfer units of data blocks and that 
a plurality of said data transfer units of data blocks 
have been separately compressed and recorded in 
respective ones of said compressed blocks. 
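
For illustration only, the cooperation of the selection, allocation, compression and directory means recited in claim 1 might be sketched as follows. This is not the claimed implementation. The 512-byte storage area, the use of zlib as the compressor, the fallback of storing an incompressible unit as-is, and the names SECTOR_BYTES, make_transfer_units, areas_needed and record_file are all assumptions introduced for the sketch.

```python
import zlib

SECTOR_BYTES = 512  # assumed size of one addressable data storage area


def make_transfer_units(blocks, min_bytes):
    """Group whole data blocks into transfer units of at least min_bytes."""
    units, current = [], b""
    for block in blocks:
        current += block
        if len(current) >= min_bytes:
            units.append(current)
            current = b""
    if current:  # a final, shorter unit may remain
        units.append(current)
    return units


def areas_needed(n_bytes):
    """Number of storage areas needed to hold n_bytes (ceiling division)."""
    return -(-n_bytes // SECTOR_BYTES)


def record_file(blocks, min_bytes):
    """Compress each transfer unit separately and return a directory."""
    directory, next_area = [], 0
    for unit in make_transfer_units(blocks, min_bytes):
        allocated = areas_needed(len(unit))   # the "first number"
        payload = zlib.compress(unit)
        is_compressed = len(payload) < len(unit)
        if not is_compressed:
            payload = unit                    # assumption: store as-is
        used = areas_needed(len(payload))     # the "second number"
        directory.append({
            "first_area": next_area,
            "allocated": allocated,
            "used": used,
            "compressed": is_compressed,
        })
        next_area += allocated  # unused tail areas stay allocated here
    return directory
```

In this sketch every directory entry satisfies used <= allocated, which mirrors the requirement that the second number be equal to or less than the first number.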

2. In the apparatus set forth in claim 1 further includ- 
ing, in combination: 
range means in said programmed machine indicating
a range of number of bytes to be used for transferring
said data transfer units between said programmed
machine and said data storage device; and

said selection means being connected to said range 
means for receiving said range indication and re- 
sponding to the received range indication for se- 
lecting said given number of said data blocks to be 
within said indicated range of data bytes such that 10 
each of said data transfer units has a number of 
bytes of data within said indicated range of number 
of bytes. 
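
The range-based selection of claim 2 can be pictured with a short sketch. It assumes a simple greedy rule (fill each transfer unit as far as the upper bound allows) that the claim does not prescribe; the function name select_units_in_range and its parameters are hypothetical.

```python
def select_units_in_range(block_sizes, lo, hi):
    """Greedily group consecutive blocks so each unit totals lo..hi bytes.

    Returns a list of units, each a list of block indices. The last unit
    may fall short of lo when the file runs out of blocks.
    """
    units, current, total = [], [], 0
    for i, size in enumerate(block_sizes):
        if size > hi:
            raise ValueError(f"block {i} alone exceeds the range")
        if total + size > hi:
            # Adding this block would overshoot, so close the current unit.
            if total < lo:
                raise ValueError("cannot keep unit within range")
            units.append(current)
            current, total = [], 0
        current.append(i)
        total += size
    if current:
        units.append(current)
    return units
```

For ten 4096-byte blocks and a range of 8192 to 16384 bytes, the rule yields two units of four blocks and a final unit of two blocks.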

3. In the apparatus set forth in claim 1 further including,
in combination:

CKD means for supplying a plurality of CKD data 
blocks; 

said data storage device having a CKD formatted
disk for receiving and recording CKD data; 

said selection means being connected to said CKD
means for receiving and selecting a predetermined 
number of said CKD data blocks for creating said 
data transfer units of data blocks; 

said data access means having CKD recording means 
for receiving and recording each of said compressed
blocks as a single record on said CKD
formatted disk; and 

repeat means connected to said selection means and 
to said CKD means for repeatedly actuating the 
CKD means to supply said data transfer units of
data blocks for compression and recording in said 
respective CKD records. 

4. In the apparatus set forth in claim 1 further includ- 
ing, in combination: 

said programmed machine including a host processor
connected to a peripheral controller, said data stor- 
age device being connected to said peripheral con- 
troller; 

an FBA sectored disk in said data storage device, said 
FBA sectored disk having a plurality of addressable
sectors, each said sector being one of said
addressable data storage areas; and 

said selection means having FBA means for selecting 
said data blocks for creating said data transfer units 
to be recorded in a predetermined number of said
sectors on said FBA sectored disk; and 

repeat means connected to said selection means and 
said compression means for repeatedly actuating 
the selection means and said compression means for 
respectively creating a plurality of said data transfer
units of data blocks from said data blocks for
compression and compressing each of said created 
data transfer units of data blocks as a compressed 
block for recording said compressed blocks on said 
FBA sectored disk such that said file of data blocks
is recorded in compressed form on said FBA sec- 
tored disk in a plurality of said compressed blocks. 

5. In the apparatus set forth in claim 1 further includ- 
ing, in combination: 

an FBA formatted disk in said data storage device,
said FBA formatted disk having a plurality of ad- 
dressable sectors for receiving and recording data, 
each of said sectors having a predetermined data 
storage capacity indicated by a predetermined 
number of data bytes; and

said selection means being connected to the FBA 
formatted disk for responding to said sector data 
storing capacity for selecting said given number of
bytes of data to be said predetermined number of
bytes in each of said sectors. 

6. In the apparatus set forth in claim 1 further including,
in combination:

data recording management means connected to said 
directory means and to said data access means for 
actuating the directory means to establish a plural- 
ity of said file directories, one file directory for 
each said compressed block; 

said recording management means actuating said 
directory means to record in each of said file direc- 
tories a number of said data blocks to be included in 
each of said data transfer units of data blocks and 
including recording a maximum number of bytes to 
be included in any one of said data transfer units; 
and 

said selection means being connected to said directory
means for reading said maximum number of
bytes and said number of data blocks and respond- 
ing to said read numbers to select said data transfer 
units. 

7. In the apparatus set forth in claim 1 further includ- 
ing, in combination: 

update means connected to said selection means and 
to said allocation means for actuating said selection
means to update a predetermined one of said
recorded compressed blocks with updated data 
blocks including receiving updated ones of said 
data blocks and creating a new data transfer unit of 
data blocks to include said updated data blocks; 

said compression means receiving and compressing 
said new data transfer unit into a new compression 
block; and 

said update means connected to said allocation means 
for actuating the allocation means to allocate a 
number of said addressable data storage areas for 
receiving and recording said new compression 
block in said data storage device. 

8. In the apparatus set forth in claim 1 further includ- 
ing, in combination: 

first means in said allocation means for allocating 
predetermined ones of said addressable data storing 
areas for receiving and recording each of said com- 
pressed blocks; and 

second means in said allocation means responsive to 
said first means allocating said predetermined ones 
of said addressable data storage areas to deallocate 
second predetermined ones of said addressable data 
storage areas that recorded respective ones of said 
compressed blocks having identical identifications 
as data blocks recorded in said first means allocated 
predetermined ones of said addressable data stor- 
age areas. 
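
The update path of claims 7 and 8 (record the updated transfer unit in newly allocated areas, then deallocate the areas that held the earlier copy with the same identification) could look roughly like the sketch below. The Store class, its methods, the 512-byte area size and the use of zlib are hypothetical choices made only for this illustration.

```python
import zlib

SECTOR_BYTES = 512  # assumed size of one addressable data storage area


class Store:
    """Toy allocator plus directory keyed by transfer-unit identification."""

    def __init__(self, n_areas):
        self.free = set(range(n_areas))
        self.areas = {}      # area number -> bytes recorded there
        self.directory = {}  # unit id -> list of area numbers

    def write_unit(self, unit_id, data):
        payload = zlib.compress(data)
        count = -(-len(payload) // SECTOR_BYTES)  # ceiling division
        if count > len(self.free):
            raise MemoryError("not enough free storage areas")
        # Allocate new areas first; the old copy is left untouched so far.
        new_areas = sorted(self.free)[:count]
        self.free.difference_update(new_areas)
        for k, area in enumerate(new_areas):
            start = k * SECTOR_BYTES
            self.areas[area] = payload[start:start + SECTOR_BYTES]
        old_areas = self.directory.get(unit_id, [])
        self.directory[unit_id] = new_areas
        # Deallocate the areas holding the earlier block with the same id.
        for area in old_areas:
            del self.areas[area]
            self.free.add(area)

    def read_unit(self, unit_id):
        payload = b"".join(self.areas[a] for a in self.directory[unit_id])
        return zlib.decompress(payload)
```

Writing the new block before releasing the old one avoids the in-place rewrite, which matters because the updated data may not compress to the same size as the original.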

9. In a machine-effected method of compressing and 
recording data blocks onto a data storage medium having
a plurality of addressable data storage areas, including
machine-executed steps of:

first selecting a plurality of data blocks of a file to be 
compressed and recorded on the data storage medium;

second selecting a plurality of submultiples of said 
selected data blocks respectively as a plurality of 
data transfer units; 

estimating a maximum number of said addressable 
data storage areas to be allocated for storing said 
selected plurality of data blocks after compression 
in said data storage medium;
allocating said maximum number of said addressable
data storage areas to receive and store said selected 
plurality of data blocks in a compressed form and 
indicating the allocation of said addressable data 
storage areas; 

compressing and recording each of said data transfer 
units as respective compressed blocks including 
recording each of said compressed blocks as a sepa- 
rately recorded record; and 

creating and maintaining a separate file directory 
indicating the address and size of each of said recorded
compressed blocks for enabling random
access to each said recorded compressed block 
such that less than an entirety of said file of data 
blocks are retrieved from said data storage medium 
for accessing only predetermined ones of said re- 
corded blocks less than all of said recorded com- 
pressed blocks and modifying said indicated alloca- 
tion to indicate a number of said addressable data 
storage areas storing said compressed block. 
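
The random access enabled by the file directory of claim 9 can be sketched as below. The sketch models the medium as a flat byte string and uses zlib; both are assumptions, as are the function names record and read_unit.

```python
import zlib


def record(units):
    """Compress each unit separately; return (medium bytes, directory)."""
    medium, directory = bytearray(), []
    for unit in units:
        payload = zlib.compress(unit)
        directory.append((len(medium), len(payload)))  # (address, size)
        medium += payload
    return bytes(medium), directory


def read_unit(medium, directory, index):
    """Decompress one unit without reading the other compressed blocks."""
    address, size = directory[index]
    return zlib.decompress(medium[address:address + size])
```

Because each data transfer unit is compressed on its own, retrieving unit 1 of a three-unit file touches only the bytes listed in that unit's directory entry.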

10. In the machine-effected method set forth in claim 
9 further including machine-executed steps of:

establishing a data storage space management for said 
file including establishing said file directory to 
include indications of a desired size in data bytes of
each of said data transfer units and establishing
one entry in the file directory for each of said re- 
corded compressed blocks. 

11. In the machine-effected method set forth in claim
9 further including machine-executed steps of:

before recording one of said compressed blocks, allo- 
cating a first number of said addressable data stor- 
age areas of the record medium for recording said 
one compressed block; and
after recording said one compressed data block, deallocating
a second number of said addressable data
storage areas that contain a recorded compressed
blocks wherein said second number is less than said 
first number. 

12. In the machine-effected method set forth in claim 
9 further including machine-executed steps of: 

supplying CKD formatted data blocks of one CKD 
formatted file and selecting said CKD data blocks 
to be compressed and recorded; 

separately compressing a plurality of said data transfer
units of said CKD data blocks into respective
ones of said compressed blocks; and 

recording the plurality of compressed blocks as one 
record on a CKD formatted record member. 

13. In the machine-effected method set forth in claim 
9 further including machine-executed steps of: 

selecting an FBA formatted record medium to be said 
record medium, selecting said FBA formatted re- 
cord medium to have a plurality of addressable 
data-storing sectors, selecting each data-storing 
sector to be capable of recording a given number of 
data bytes; and 

selecting said data transfer units to respectively have 
a first predetermined number of said data blocks 
have a number of uncompressed data bytes equal to 
a data storage capacity, in data bytes, of a second 
predetermined number of said data-storing sectors. 
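
The sizing rule of claim 13 reduces to simple arithmetic, shown here as a hedged sketch. The function name blocks_per_unit and the example sizes (4096-byte blocks, 512-byte sectors, 64 sectors per unit) are assumptions and do not come from the claims.

```python
def blocks_per_unit(block_bytes, sector_bytes, sectors_per_unit):
    """Blocks per transfer unit so the uncompressed unit exactly fills
    sectors_per_unit sectors."""
    unit_bytes = sector_bytes * sectors_per_unit
    if unit_bytes % block_bytes:
        raise ValueError("unit size is not a whole number of blocks")
    return unit_bytes // block_bytes
```

With the example sizes, a transfer unit spans 32768 uncompressed bytes, that is, eight data blocks.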

14. In the machine-effected method set forth in claim 
9 further including machine-executed steps of: 

setting a range of number of bytes to be included in 
each of said data transfer units; and 

selecting a number of said data blocks for inclusion in
each of said data transfer units such that said selected
number of data bytes for each of said data
transfer units is within said range. 